Overview

Dataset statistics

Number of variables: 27
Number of observations: 2719136
Missing cells: 941627
Missing cells (%): 1.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 GiB
Average record size in memory: 1.1 KiB
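
The percentage of missing cells and the average record size follow from the raw counts listed above. A quick sanity check in plain Python (the variable names are illustrative and not part of the report; the 2.9 GiB figure is already rounded, so the record size is approximate):

```python
# Figures copied from the dataset statistics above.
n_variables = 27
n_observations = 2_719_136
missing_cells = 941_627
total_size_gib = 2.9  # reported value, rounded to one decimal

# Share of missing cells = missing cells / (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size = total memory / number of rows, converted GiB -> KiB.
avg_record_kib = total_size_gib * 1024**2 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```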

Variable types

Categorical: 14
Numeric: 12
Boolean: 1

Alerts

Category has constant value "DRUGS" [Constant]
SpecialisationName has a high cardinality: 95 distinct values [High cardinality]
BillNo has a high cardinality: 1099192 distinct values [High cardinality]
BillDate has a high cardinality: 1527 distinct values [High cardinality]
ItemName has a high cardinality: 4364 distinct values [High cardinality]
Item_Code has a high cardinality: 4223 distinct values [High cardinality]
GenericName has a high cardinality: 2962 distinct values [High cardinality]
UHID is highly overall correlated with Bill_Year [High correlation]
TQty is highly overall correlated with ReturnMRP [High correlation]
UCPwithoutGST is highly overall correlated with MRP and 2 other fields [High correlation]
PurGSTPer is highly overall correlated with TotalDiscount [High correlation]
MRP is highly overall correlated with UCPwithoutGST and 2 other fields [High correlation]
TotalCost is highly overall correlated with UCPwithoutGST and 2 other fields [High correlation]
TotalDiscount is highly overall correlated with PurGSTPer [High correlation]
NetSales is highly overall correlated with UCPwithoutGST and 3 other fields [High correlation]
ReturnMRP is highly overall correlated with TQty and 1 other fields [High correlation]
Bill_Month is highly overall correlated with Bill_Week [High correlation]
Bill_Week is highly overall correlated with Bill_Month [High correlation]
SalesType is highly overall correlated with PharmacyType [High correlation]
PharmacyType is highly overall correlated with SalesType and 1 other fields [High correlation]
Department is highly overall correlated with PharmacyType [High correlation]
IsFormulary is highly overall correlated with Formulary [High correlation]
Formulary is highly overall correlated with IsFormulary [High correlation]
Bill_Year is highly overall correlated with UHID [High correlation]
SalesType is highly imbalanced (50.7%) [Imbalance]
Department is highly imbalanced (73.6%) [Imbalance]
IsFormulary is highly imbalanced (51.3%) [Imbalance]
Formulary is highly imbalanced (61.0%) [Imbalance]
SubCategory is highly imbalanced (53.9%) [Imbalance]
UHID has 156637 (5.8%) missing values [Missing]
Department has 442465 (16.3%) missing values [Missing]
Formulary has 332707 (12.2%) missing values [Missing]
UCPwithoutGST is highly skewed (γ1 = 63.87752404) [Skewed]
MRP is highly skewed (γ1 = 50.22161308) [Skewed]
TotalCost is highly skewed (γ1 = 39.50248905) [Skewed]
TotalDiscount is highly skewed (γ1 = 285.8811301) [Skewed]
NetSales is highly skewed (γ1 = 31.18778957) [Skewed]
ReturnMRP is highly skewed (γ1 = 42.28664228) [Skewed]
PurGSTPer has 2105336 (77.4%) zeros [Zeros]
TotalDiscount has 2321505 (85.4%) zeros [Zeros]
NetSales has 269495 (9.9%) zeros [Zeros]
ReturnMRP has 2450236 (90.1%) zeros [Zeros]

Reproduction

Analysis started: 2023-05-08 13:34:48.098751
Analysis finished: 2023-05-08 13:39:10.250625
Duration: 4 minutes and 22.15 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
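
The reported duration can be checked against the two timestamps. A minimal sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-05-08 13:34:48.098751")
finished = datetime.fromisoformat("2023-05-08 13:39:10.250625")

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```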

Variables

SalesType
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 197.1 MiB

Value | Count
IP Dispense | 1843450
OTC Dispense | 576578
IP Return | 261364
OP Dispense | 30203
OTC Return | 7347

Length

Max length: 12
Median length: 11
Mean length: 11.016959
Min length: 9

Characters and Unicode

Total characters: 29956611
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IP Dispense
2nd row: IP Dispense
3rd row: IP Dispense
4th row: IP Dispense
5th row: IP Dispense

Common Values

Value | Count | Frequency (%)
IP Dispense | 1843450 | 67.8%
OTC Dispense | 576578 | 21.2%
IP Return | 261364 | 9.6%
OP Dispense | 30203 | 1.1%
OTC Return | 7347 | 0.3%
OP Return | 194 | < 0.1%
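
The 50.7% imbalance flagged for SalesType is consistent with a normalised-entropy score, 1 - H / log(k), where H is the Shannon entropy of the class shares and k is the number of classes. That formula is an assumption about how the figure is derived, not something the report states; the sketch below only checks it against the counts in the table above. The function name `imbalance_score` is illustrative.

```python
import math

# Counts copied from the SalesType "Common Values" table above.
counts = {
    "IP Dispense": 1_843_450,
    "OTC Dispense": 576_578,
    "IP Return": 261_364,
    "OP Dispense": 30_203,
    "OTC Return": 7_347,
    "OP Return": 194,
}

def imbalance_score(class_counts):
    """Return 1 - H / log(k): 0 for a uniform split, approaching 1 for a single dominant class."""
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the class shares (natural log; the base cancels in the ratio).
    entropy = -sum((c / total) * math.log(c / total) for c in class_counts if c > 0)
    return 1 - entropy / math.log(k)

score = imbalance_score(list(counts.values()))
print(f"imbalance: {100 * score:.1f}%")
```

A perfectly balanced variable scores 0 under this definition, so a value near 50% reflects the dominance of "IP Dispense" over the five smaller classes.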

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
dispense | 2450231 | 45.1%
ip | 2104814 | 38.7%
otc | 583925 | 10.7%
return | 268905 | 4.9%
op | 30397 | 0.6%

Most occurring characters

ValueCountFrequency (%)
e 5169367
17.3%
s 4900462
16.4%
2719136
9.1%
n 2719136
9.1%
D 2450231
8.2%
i 2450231
8.2%
p 2450231
8.2%
P 2135211
7.1%
I 2104814
7.0%
O 614322
 
2.1%
Other values (6) 2243470
7.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18496142
61.7%
Uppercase Letter 8741333
29.2%
Space Separator 2719136
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5169367
27.9%
s 4900462
26.5%
n 2719136
14.7%
i 2450231
13.2%
p 2450231
13.2%
t 268905
 
1.5%
u 268905
 
1.5%
r 268905
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
D 2450231
28.0%
P 2135211
24.4%
I 2104814
24.1%
O 614322
 
7.0%
T 583925
 
6.7%
C 583925
 
6.7%
R 268905
 
3.1%
Space Separator
ValueCountFrequency (%)
2719136
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27237475
90.9%
Common 2719136
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5169367
19.0%
s 4900462
18.0%
n 2719136
10.0%
D 2450231
9.0%
i 2450231
9.0%
p 2450231
9.0%
P 2135211
7.8%
I 2104814
7.7%
O 614322
 
2.3%
T 583925
 
2.1%
Other values (5) 1659545
 
6.1%
Common
ValueCountFrequency (%)
2719136
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29956611
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5169367
17.3%
s 4900462
16.4%
2719136
9.1%
n 2719136
9.1%
D 2450231
8.2%
i 2450231
8.2%
p 2450231
8.2%
P 2135211
7.1%
I 2104814
7.0%
O 614322
 
2.1%
Other values (6) 2243470
7.5%

UHID
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 58130
Distinct (%): 2.3%
Missing: 156637
Missing (%): 5.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2018057 × 10^10
Minimum: 1.2018 × 10^10
Maximum: 1.2018136 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.5 MiB

Quantile statistics

Minimum: 1.2018 × 10^10
5-th percentile: 1.2018003 × 10^10
Q1: 1.2018022 × 10^10
median: 1.2018052 × 10^10
Q3: 1.2018089 × 10^10
95-th percentile: 1.2018123 × 10^10
Maximum: 1.2018136 × 10^10
Range: 135522
Interquartile range (IQR): 67741

Descriptive statistics

Standard deviation: 38728.754
Coefficient of variation (CV): 3.2225471 × 10^-6
Kurtosis: -1.1398444
Mean: 1.2018057 × 10^10
Median Absolute Deviation (MAD): 33026
Skewness: 0.26990207
Sum: 3.0796258 × 10^16
Variance: 1.4999164 × 10^9
Monotonicity: Not monotonic
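
Several of the descriptive statistics for UHID are tied together by simple identities: the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A short check using the figures shown for UHID (all inputs are the report's rounded values, so the results agree only to the displayed precision; variable names are illustrative):

```python
# Values copied from the UHID statistics (already rounded by the report).
mean = 1.2018057e10
std = 38728.754
q1 = 1.2018022e10
q3 = 1.2018089e10

variance = std ** 2           # variance is the squared standard deviation
cv = std / mean               # coefficient of variation
iqr_from_rounded = q3 - q1    # approximate only: Q1 and Q3 are displayed rounded

print(f"variance ~ {variance:.7e}")
print(f"CV ~ {cv:.7e}")
print(f"IQR from rounded quartiles ~ {iqr_from_rounded:.0f} (report: 67741)")
```

The tiny coefficient of variation reflects that all values sit within a range of 135522 around 1.2018 × 10^10.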
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1.201806711 × 10^10 | 3556 | 0.1%
1.201800818 × 10^10 | 3356 | 0.1%
1.20180057 × 10^10 | 3066 | 0.1%
1.201801635 × 10^10 | 2416 | 0.1%
1.201807952 × 10^10 | 2311 | 0.1%
1.201808741 × 10^10 | 2306 | 0.1%
1.201807345 × 10^10 | 2228 | 0.1%
1.201804314 × 10^10 | 2195 | 0.1%
1.201810332 × 10^10 | 2149 | 0.1%
1.201800347 × 10^10 | 2149 | 0.1%
Other values (58120) | 2536767 | 93.3%
(Missing) | 156637 | 5.8%
Minimum 10 values
Value | Count | Frequency (%)
1.2018 × 10^10 | 7 | < 0.1%
1.2018 × 10^10 | 15 | < 0.1%
1.2018 × 10^10 | 8 | < 0.1%
1.2018 × 10^10 | 11 | < 0.1%
1.201800001 × 10^10 | 3 | < 0.1%
1.201800001 × 10^10 | 6 | < 0.1%
1.201800001 × 10^10 | 4 | < 0.1%
1.201800001 × 10^10 | 4 | < 0.1%
1.201800001 × 10^10 | 185 | < 0.1%
1.201800001 × 10^10 | 74 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
1.201813552 × 10^10 | 2 | < 0.1%
1.201813552 × 10^10 | 3 | < 0.1%
1.201813552 × 10^10 | 2 | < 0.1%
1.201813552 × 10^10 | 5 | < 0.1%
1.201813552 × 10^10 | 11 | < 0.1%
1.201813552 × 10^10 | 7 | < 0.1%
1.201813552 × 10^10 | 8 | < 0.1%
1.201813551 × 10^10 | 1 | < 0.1%
1.201813551 × 10^10 | 8 | < 0.1%
1.201813551 × 10^10 | 4 | < 0.1%

PharmacyType
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 173.9 MiB

Value | Count
IP | 2104814
OP | 430077
OTC | 184245

Length

Max length: 3
Median length: 2
Mean length: 2.0677587
Min length: 2

Characters and Unicode

Total characters: 5622517
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IP
2nd row: IP
3rd row: IP
4th row: IP
5th row: IP

Common Values

Value | Count | Frequency (%)
IP | 2104814 | 77.4%
OP | 430077 | 15.8%
OTC | 184245 | 6.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
ip | 2104814 | 77.4%
op | 430077 | 15.8%
otc | 184245 | 6.8%

Most occurring characters

ValueCountFrequency (%)
P 2534891
45.1%
I 2104814
37.4%
O 614322
 
10.9%
T 184245
 
3.3%
C 184245
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5622517
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 2534891
45.1%
I 2104814
37.4%
O 614322
 
10.9%
T 184245
 
3.3%
C 184245
 
3.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 5622517
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 2534891
45.1%
I 2104814
37.4%
O 614322
 
10.9%
T 184245
 
3.3%
C 184245
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5622517
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 2534891
45.1%
I 2104814
37.4%
O 614322
 
10.9%
T 184245
 
3.3%
C 184245
 
3.3%

SpecialisationName
Categorical

Distinct: 95
Distinct (%): < 0.1%
Missing: 50
Missing (%): < 0.1%
Memory size: 224.5 MiB

Value | Count
Liver Disease and Transplantation | 667071
Hepatology | 413521
Covid Care Team | 248655
Internal Medicine and Diabetology | 156780
Cardiology | 93219
Other values (90) | 1139840

Length

Max length: 62
Median length: 53
Mean length: 21.283485
Min length: 4

Characters and Unicode

Total characters: 57871626
Distinct characters: 51
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Liver Disease and Transplantation
2nd row: Liver Disease and Transplantation
3rd row: Liver Disease and Transplantation
4th row: Liver Disease and Transplantation
5th row: Liver Disease and Transplantation

Common Values

Value | Count | Frequency (%)
Liver Disease and Transplantation | 667071 | 24.5%
Hepatology | 413521 | 15.2%
Covid Care Team | 248655 | 9.1%
Internal Medicine and Diabetology | 156780 | 5.8%
Cardiology | 93219 | 3.4%
Emergency Medicine | 90025 | 3.3%
Obstetrics & Gynaecology | 80036 | 2.9%
Medical Oncology | 79547 | 2.9%
Neurosurgery | 72738 | 2.7%
Pediatric Hemato Oncology | 71929 | 2.6%
Other values (85) | 745565 | 27.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and 950749
13.4%
transplantation 706016
 
10.0%
liver 697700
 
9.9%
disease 672651
 
9.5%
hepatology 481559
 
6.8%
medicine 347003
 
4.9%
care 296137
 
4.2%
covid 248655
 
3.5%
team 248655
 
3.5%
internal 204067
 
2.9%
Other values (106) 2218598
31.4%

Most occurring characters

ValueCountFrequency (%)
a 6271926
 
10.8%
e 5744754
 
9.9%
n 4601010
 
8.0%
i 4355759
 
7.5%
4352704
 
7.5%
o 4037067
 
7.0%
r 3359113
 
5.8%
t 3300373
 
5.7%
s 2651480
 
4.6%
l 2477952
 
4.3%
Other values (41) 16719488
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47219035
81.6%
Uppercase Letter 6100200
 
10.5%
Space Separator 4352704
 
7.5%
Other Punctuation 181425
 
0.3%
Dash Punctuation 18248
 
< 0.1%
Open Punctuation 7
 
< 0.1%
Close Punctuation 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6271926
13.3%
e 5744754
12.2%
n 4601010
9.7%
i 4355759
9.2%
o 4037067
8.5%
r 3359113
7.1%
t 3300373
 
7.0%
s 2651480
 
5.6%
l 2477952
 
5.2%
d 2095912
 
4.4%
Other values (13) 8323689
17.6%
Uppercase Letter
ValueCountFrequency (%)
T 962103
15.8%
D 855870
14.0%
L 711566
11.7%
C 687503
11.3%
H 594065
9.7%
M 535945
8.8%
O 346557
 
5.7%
I 296662
 
4.9%
P 288564
 
4.7%
N 182141
 
3.0%
Other values (11) 639224
10.5%
Other Punctuation
ValueCountFrequency (%)
& 159440
87.9%
, 21985
 
12.1%
Dash Punctuation
ValueCountFrequency (%)
- 9350
51.2%
– 8898
48.8%
Space Separator
ValueCountFrequency (%)
4352704
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 53319235
92.1%
Common 4552391
 
7.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6271926
11.8%
e 5744754
 
10.8%
n 4601010
 
8.6%
i 4355759
 
8.2%
o 4037067
 
7.6%
r 3359113
 
6.3%
t 3300373
 
6.2%
s 2651480
 
5.0%
l 2477952
 
4.6%
d 2095912
 
3.9%
Other values (34) 14423889
27.1%
Common
ValueCountFrequency (%)
4352704
95.6%
& 159440
 
3.5%
, 21985
 
0.5%
- 9350
 
0.2%
– 8898
 
0.2%
( 7
 
< 0.1%
) 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 57862728
> 99.9%
Punctuation 8898
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6271926
 
10.8%
e 5744754
 
9.9%
n 4601010
 
8.0%
i 4355759
 
7.5%
4352704
 
7.5%
o 4037067
 
7.0%
r 3359113
 
5.8%
t 3300373
 
5.7%
s 2651480
 
4.6%
l 2477952
 
4.3%
Other values (40) 16710590
28.9%
Punctuation
ValueCountFrequency (%)
– 8898
100.0%

Department
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 12
Distinct (%): < 0.1%
Missing: 442465
Missing (%): 16.3%
Memory size: 189.1 MiB

Value | Count
MS IP Pharmacy | 1917738
SS OT Pharmacy | 181524
New MS OP Pharmacy-GF | 82630
SS ORAGADAM. | 34088
SS Cath Lab Store | 26049
Other values (7) | 34642

Length

Max length: 28
Median length: 14
Mean length: 14.331921
Min length: 12

Characters and Unicode

Total characters: 32629070
Distinct characters: 38
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MS IP Pharmacy
2nd row: MS IP Pharmacy
3rd row: MS IP Pharmacy
4th row: MS IP Pharmacy
5th row: MS IP Pharmacy

Common Values

Value | Count | Frequency (%)
MS IP Pharmacy | 1917738 | 70.5%
SS OT Pharmacy | 181524 | 6.7%
New MS OP Pharmacy-GF | 82630 | 3.0%
SS ORAGADAM. | 34088 | 1.3%
SS Cath Lab Store | 26049 | 1.0%
New MS OP Pharmacy-2F | 23958 | 0.9%
SS COVID STORE | 9864 | 0.4%
SS OP VACCINE | 781 | < 0.1%
MS Flu ER Dispensing Counter | 23 | < 0.1%
MS OP Pharmacy-GF | 9 | < 0.1%
Other values (2) | 7 | < 0.1%
(Missing) | 442465 | 16.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
pharmacy 2099262
30.3%
ms 2024365
29.2%
ip 1917740
27.7%
ss 252306
 
3.6%
ot 181524
 
2.6%
op 107383
 
1.5%
new 106588
 
1.5%
pharmacy-gf 82639
 
1.2%
store 35913
 
0.5%
oragadam 34088
 
0.5%
Other values (10) 86800
 
1.3%

Most occurring characters

ValueCountFrequency (%)
4651937
14.3%
a 4463830
13.7%
P 4230989
13.0%
S 2564890
7.9%
r 2231938
6.8%
h 2231915
6.8%
y 2205866
6.8%
c 2205866
6.8%
m 2205866
6.8%
M 2058453
6.3%
Other values (28) 3577520
11.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15889074
48.7%
Uppercase Letter 11923406
36.5%
Space Separator 4651937
 
14.3%
Dash Punctuation 106602
 
0.3%
Other Punctuation 34088
 
0.1%
Decimal Number 23963
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4463830
28.1%
r 2231938
14.0%
h 2231915
14.0%
y 2205866
13.9%
c 2205866
13.9%
m 2205866
13.9%
e 132683
 
0.8%
w 106588
 
0.7%
t 52121
 
0.3%
o 26074
 
0.2%
Other values (8) 26327
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
P 4230989
35.5%
S 2564890
21.5%
M 2058453
17.3%
I 1928385
16.2%
O 342723
 
2.9%
T 191388
 
1.6%
G 116727
 
1.0%
N 107369
 
0.9%
F 106625
 
0.9%
A 103045
 
0.9%
Other values (6) 172812
 
1.4%
Space Separator
ValueCountFrequency (%)
4651937
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 106602
100.0%
Other Punctuation
ValueCountFrequency (%)
. 34088
100.0%
Decimal Number
ValueCountFrequency (%)
2 23963
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27812480
85.2%
Common 4816590
 
14.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4463830
16.0%
P 4230989
15.2%
S 2564890
9.2%
r 2231938
8.0%
h 2231915
8.0%
y 2205866
7.9%
c 2205866
7.9%
m 2205866
7.9%
M 2058453
7.4%
I 1928385
6.9%
Other values (24) 1484482
 
5.3%
Common
ValueCountFrequency (%)
4651937
96.6%
- 106602
 
2.2%
. 34088
 
0.7%
2 23963
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32629070
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4651937
14.3%
a 4463830
13.7%
P 4230989
13.0%
S 2564890
7.9%
r 2231938
6.8%
h 2231915
6.8%
y 2205866
6.8%
c 2205866
6.8%
m 2205866
6.8%
M 2058453
6.3%
Other values (28) 3577520
11.0%

BillNo
Categorical

Distinct: 1099192
Distinct (%): 40.4%
Missing: 0
Missing (%): 0.0%
Memory size: 199.4 MiB

Value | Count
OPPDS22-04858 | 190
OTCPDS22-04858 | 56
PDS21-141817 | 51
PDS20-164638 | 50
PDS21-51030 | 46
Other values (1099187) | 2718743

Length

Max length: 14
Median length: 13
Mean length: 11.891973
Min length: 11

Characters and Unicode

Total characters: 32335893
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 597430
Unique (%): 22.0%

Sample

1st row: PDS19-08659
2nd row: PDS19-08923
3rd row: PDS19-08981
4th row: PDS19-08981
5th row: PDS19-08981

Common Values

Value | Count | Frequency (%)
OPPDS22-04858 | 190 | < 0.1%
OTCPDS22-04858 | 56 | < 0.1%
PDS21-141817 | 51 | < 0.1%
PDS20-164638 | 50 | < 0.1%
PDS21-51030 | 46 | < 0.1%
OPPDS21-10404 | 44 | < 0.1%
PDS20-151216 | 43 | < 0.1%
PDS20-136208 | 42 | < 0.1%
PDS20-131943 | 42 | < 0.1%
OTCPDS19-05176 | 42 | < 0.1%
Other values (1099182) | 2718530 | > 99.9%
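
The bill numbers shown above appear to share a layout: an uppercase prefix, two digits, a dash, and a numeric sequence. The sketch below splits a value along those lines. The pattern, the reading of the two digits as a year, and the helper name `split_bill_no` are all assumptions inferred only from the displayed samples; values elsewhere in the column may not follow it, in which case the function returns `None`.

```python
import re

# Assumed layout, inferred from the sample values only:
# uppercase prefix, two digits (possibly a year), a dash, a numeric sequence.
BILL_NO = re.compile(r"^(?P<prefix>[A-Z]+)(?P<year>\d{2})-(?P<sequence>\d+)$")

def split_bill_no(bill_no):
    """Return (prefix, two-digit year, sequence), or None if the value does not match."""
    match = BILL_NO.match(bill_no)
    if match is None:
        return None
    return match["prefix"], match["year"], match["sequence"]

for value in ["PDS19-08659", "OPPDS22-04858", "OTCPDS19-05176"]:
    print(value, "->", split_bill_no(value))
```

Keeping the sequence as a string preserves leading zeros such as those in "08659".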

Length

Histogram of lengths of the category
ValueCountFrequency (%)
oppds22-04858 190
 
< 0.1%
otcpds22-04858 56
 
< 0.1%
pds21-141817 51
 
< 0.1%
pds20-164638 50
 
< 0.1%
pds21-51030 46
 
< 0.1%
oppds21-10404 44
 
< 0.1%
pds20-151216 43
 
< 0.1%
pds20-131943 42
 
< 0.1%
otcpds19-05176 42
 
< 0.1%
pds20-136208 42
 
< 0.1%
Other values (1099182) 2718530
> 99.9%

Most occurring characters

ValueCountFrequency (%)
2 4765257
14.7%
1 3665094
11.3%
P 2845832
8.8%
S 2719136
8.4%
- 2719136
8.4%
D 2450546
 
7.6%
0 1993691
 
6.2%
9 1748097
 
5.4%
3 1456931
 
4.5%
4 1374098
 
4.2%
Other values (9) 6598075
20.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20123962
62.2%
Uppercase Letter 9492795
29.4%
Dash Punctuation 2719136
 
8.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 4765257
23.7%
1 3665094
18.2%
0 1993691
9.9%
9 1748097
 
8.7%
3 1456931
 
7.2%
4 1374098
 
6.8%
5 1322804
 
6.6%
6 1290330
 
6.4%
7 1257186
 
6.2%
8 1250474
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
P 2845832
30.0%
S 2719136
28.6%
D 2450546
25.8%
O 576891
 
6.1%
L 268590
 
2.8%
R 268590
 
2.8%
C 181605
 
1.9%
T 181605
 
1.9%
Dash Punctuation
ValueCountFrequency (%)
- 2719136
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22843098
70.6%
Latin 9492795
29.4%

Most frequent character per script

Common
ValueCountFrequency (%)
2 4765257
20.9%
1 3665094
16.0%
- 2719136
11.9%
0 1993691
8.7%
9 1748097
 
7.7%
3 1456931
 
6.4%
4 1374098
 
6.0%
5 1322804
 
5.8%
6 1290330
 
5.6%
7 1257186
 
5.5%
Latin
ValueCountFrequency (%)
P 2845832
30.0%
S 2719136
28.6%
D 2450546
25.8%
O 576891
 
6.1%
L 268590
 
2.8%
R 268590
 
2.8%
C 181605
 
1.9%
T 181605
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32335893
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 4765257
14.7%
1 3665094
11.3%
P 2845832
8.8%
S 2719136
8.4%
- 2719136
8.4%
D 2450546
 
7.6%
0 1993691
 
6.2%
9 1748097
 
5.4%
3 1456931
 
4.5%
4 1374098
 
4.2%
Other values (9) 6598075
20.4%

BillDate
Categorical

Distinct: 1527
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 194.5 MiB

Value | Count
2022-12-20 | 3804
2023-02-15 | 3778
2023-02-03 | 3706
2023-02-24 | 3694
2023-02-07 | 3655
Other values (1522) | 2700499

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 27191360
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019-01-30
2nd row: 2019-01-31
3rd row: 2019-01-31
4th row: 2019-01-31
5th row: 2019-01-31

Common Values

Value | Count | Frequency (%)
2022-12-20 | 3804 | 0.1%
2023-02-15 | 3778 | 0.1%
2023-02-03 | 3706 | 0.1%
2023-02-24 | 3694 | 0.1%
2023-02-07 | 3655 | 0.1%
2023-03-01 | 3631 | 0.1%
2023-01-24 | 3595 | 0.1%
2023-02-16 | 3581 | 0.1%
2022-12-15 | 3561 | 0.1%
2023-03-04 | 3543 | 0.1%
Other values (1517) | 2682588 | 98.7%
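
BillDate is profiled as a categorical variable, yet every value is a 10-character string in YYYY-MM-DD form, so it can be parsed as a date. The sketch below parses such a string and derives a year, month, and week number. The alerts mention Bill_Year, Bill_Month, and Bill_Week fields, but the report does not say how they were computed; using the ISO week number here is an assumption, and `date_parts` is an illustrative name.

```python
from datetime import date

def date_parts(bill_date):
    """Parse a YYYY-MM-DD string and return (year, month, ISO week number)."""
    parsed = date.fromisoformat(bill_date)
    # isocalendar() yields (ISO year, ISO week, ISO weekday); index 1 is the week.
    return parsed.year, parsed.month, parsed.isocalendar()[1]

for value in ["2019-01-30", "2022-12-20", "2023-03-04"]:
    print(value, "->", date_parts(value))
```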

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2022-12-20 3804
 
0.1%
2023-02-15 3778
 
0.1%
2023-02-03 3706
 
0.1%
2023-02-24 3694
 
0.1%
2023-02-07 3655
 
0.1%
2023-03-01 3631
 
0.1%
2023-01-24 3595
 
0.1%
2023-02-16 3581
 
0.1%
2022-12-15 3561
 
0.1%
2023-03-04 3543
 
0.1%
Other values (1517) 2682588
98.7%

Most occurring characters

ValueCountFrequency (%)
2 7580107
27.9%
0 6486672
23.9%
- 5438272
20.0%
1 3587656
13.2%
9 918831
 
3.4%
3 820265
 
3.0%
8 487728
 
1.8%
5 475930
 
1.8%
7 469444
 
1.7%
4 466863
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21753088
80.0%
Dash Punctuation 5438272
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 7580107
34.8%
0 6486672
29.8%
1 3587656
16.5%
9 918831
 
4.2%
3 820265
 
3.8%
8 487728
 
2.2%
5 475930
 
2.2%
7 469444
 
2.2%
4 466863
 
2.1%
6 459592
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 5438272
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27191360
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 7580107
27.9%
0 6486672
23.9%
- 5438272
20.0%
1 3587656
13.2%
9 918831
 
3.4%
3 820265
 
3.0%
8 487728
 
1.8%
5 475930
 
1.8%
7 469444
 
1.7%
4 466863
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27191360
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 7580107
27.9%
0 6486672
23.9%
- 5438272
20.0%
1 3587656
13.2%
9 918831
 
3.4%
3 820265
 
3.0%
8 487728
 
1.8%
5 475930
 
1.8%
7 469444
 
1.7%
4 466863
 
1.7%

ItemName
Categorical

Distinct: 4364
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 280.6 MiB

Value | Count
SODIUM CHLORIDE 0.9% 100MLCLARIS | 93660
EMESET (ONDANSETRON) 2ML INJ CIPLA | 58981
KABILYTE (MULTIPLE ELECTROLYTES) 500ML FLEXBAG FRESENIUS | 56517
WATER FOR INJECTION 10ML CLARIS | 49640
SODIUM CHLORIDE 0.9% 500ML (UNIBAG ) CLARIS | 42158
Other values (4359) | 2418180

Length

Max length: 186
Median length: 118
Mean length: 43.159531
Min length: 12

Characters and Unicode

Total characters: 117356635
Distinct characters: 70
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 241
Unique (%): < 0.1%

Sample

1st row: NEKSIUM (ESOMEPRAZOLE) 20MG TAB 1x10 PFIZER
2nd row: FENSTUD (FENTANYL) 50MCG/10ML INJ RUSAN
3rd row: CALCIUM GLUCONATE 10ML INJ 1X1 HINDU
4th row: CLOHEX MOUTH WASH 150ML DR REDDYS
5th row: DEXTROSE 5% 500ML CLARIS

Common Values

Value | Count | Frequency (%)
SODIUM CHLORIDE 0.9% 100MLCLARIS | 93660 | 3.4%
EMESET (ONDANSETRON) 2ML INJ CIPLA | 58981 | 2.2%
KABILYTE (MULTIPLE ELECTROLYTES) 500ML FLEXBAG FRESENIUS | 56517 | 2.1%
WATER FOR INJECTION 10ML CLARIS | 49640 | 1.8%
SODIUM CHLORIDE 0.9% 500ML (UNIBAG ) CLARIS | 42158 | 1.6%
SODIUM CHLORIDE 0.9% 100ML FREEFLEX BAG FRESENIUS KABI | 37437 | 1.4%
PANSEC (PANTOPRAZOLE) 40MG INJ CIPLA | 33705 | 1.2%
LOX JELLY (LIGNOCAINE HYDROCHLORIDE) 2% 30GM GEL NEON | 31712 | 1.2%
PARACIP (PARACETAMOL 1GM) 100ML IV INJ CIPLA | 26982 | 1.0%
PARIDEM (PANTOPRAZOLE) 40MG INJ EMCURE | 26763 | 1.0%
Other values (4354) | 2261581 | 83.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
inj 1103097
 
6.4%
tab 541363
 
3.2%
506197
 
2.9%
sodium 337340
 
2.0%
cipla 322268
 
1.9%
chloride 318692
 
1.9%
neon 291506
 
1.7%
1x10 284371
 
1.7%
0.9 258831
 
1.5%
100ml 237819
 
1.4%
Other values (5754) 12964845
75.5%

Most occurring characters

ValueCountFrequency (%)
14783213
 
12.6%
E 7878108
 
6.7%
A 7797036
 
6.6%
I 7659471
 
6.5%
L 6837224
 
5.8%
N 6283348
 
5.4%
O 5904127
 
5.0%
M 5846812
 
5.0%
R 4980870
 
4.2%
T 4676856
 
4.0%
Other values (60) 44709570
38.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 86695661
73.9%
Space Separator 14783839
 
12.6%
Decimal Number 9425188
 
8.0%
Open Punctuation 2170817
 
1.8%
Close Punctuation 2169438
 
1.8%
Other Punctuation 1130874
 
1.0%
Lowercase Letter 507917
 
0.4%
Math Symbol 407600
 
0.3%
Dash Punctuation 65252
 
0.1%
Connector Punctuation 31
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 7878108
 
9.1%
A 7797036
 
9.0%
I 7659471
 
8.8%
L 6837224
 
7.9%
N 6283348
 
7.2%
O 5904127
 
6.8%
M 5846812
 
6.7%
R 4980870
 
5.7%
T 4676856
 
5.4%
C 4159306
 
4.8%
Other values (17) 24672503
28.5%
Lowercase Letter
ValueCountFrequency (%)
x 490326
96.5%
r 7663
 
1.5%
g 3415
 
0.7%
m 3370
 
0.7%
s 441
 
0.1%
c 321
 
0.1%
o 319
 
0.1%
e 319
 
0.1%
t 319
 
0.1%
f 319
 
0.1%
Other values (7) 1105
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 3745782
39.7%
1 2373015
25.2%
5 1390271
 
14.8%
2 738786
 
7.8%
4 443735
 
4.7%
9 274766
 
2.9%
3 258011
 
2.7%
6 124398
 
1.3%
7 48084
 
0.5%
8 28340
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 576541
51.0%
% 436838
38.6%
/ 104854
 
9.3%
& 9706
 
0.9%
, 2935
 
0.3%
Space Separator
ValueCountFrequency (%)
14783213
> 99.9%
  626
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 2170810
> 99.9%
[ 7
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 2169431
> 99.9%
] 7
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 407589
> 99.9%
| 11
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 65252
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 31
100.0%
Final Punctuation
ValueCountFrequency (%)
’ 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 87203578
74.3%
Common 30153057
 
25.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 7878108
 
9.0%
A 7797036
 
8.9%
I 7659471
 
8.8%
L 6837224
 
7.8%
N 6283348
 
7.2%
O 5904127
 
6.8%
M 5846812
 
6.7%
R 4980870
 
5.7%
T 4676856
 
5.4%
C 4159306
 
4.8%
Other values (34) 25180420
28.9%
Common
ValueCountFrequency (%)
14783213
49.0%
0 3745782
 
12.4%
1 2373015
 
7.9%
( 2170810
 
7.2%
) 2169431
 
7.2%
5 1390271
 
4.6%
2 738786
 
2.5%
. 576541
 
1.9%
4 443735
 
1.5%
% 436838
 
1.4%
Other values (16) 1324635
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 117355174
> 99.9%
None 1443
 
< 0.1%
Punctuation 18
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
14783213
 
12.6%
E 7878108
 
6.7%
A 7797036
 
6.6%
I 7659471
 
6.5%
L 6837224
 
5.8%
N 6283348
 
5.4%
O 5904127
 
5.0%
M 5846812
 
5.0%
R 4980870
 
4.2%
T 4676856
 
4.0%
Other values (57) 44708109
38.1%
None
ValueCountFrequency (%)
 817
56.6%
  626
43.4%
Punctuation
ValueCountFrequency (%)
’ 18
100.0%

Item_Code
Categorical

Distinct: 4223
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 189.3 MiB

Value | Count
PH000579 | 93660
PH000258 | 58981
PH000561 | 56517
PH000439 | 49640
PH000569 | 42158
Other values (4218) | 2418180

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 21753088
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 219
Unique (%): < 0.1%

Sample

1st row: PH001036
2nd row: PH001660
3rd row: PH000543
4th row: PH000604
5th row: PH000564

Common Values

Value | Count | Frequency (%)
PH000579 | 93660 | 3.4%
PH000258 | 58981 | 2.2%
PH000561 | 56517 | 2.1%
PH000439 | 49640 | 1.8%
PH000569 | 42158 | 1.6%
PH002839 | 37441 | 1.4%
PH000295 | 33705 | 1.2%
PH000208 | 31712 | 1.2%
PH000589 | 26982 | 1.0%
PH003917 | 26763 | 1.0%
Other values (4213) | 2261577 | 83.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ph000579 93660
 
3.4%
ph000258 58981
 
2.2%
ph000561 56517
 
2.1%
ph000439 49640
 
1.8%
ph000569 42158
 
1.6%
ph002839 37441
 
1.4%
ph000295 33705
 
1.2%
ph000208 31712
 
1.2%
ph000589 26982
 
1.0%
ph003917 26763
 
1.0%
Other values (4213) 2261577
83.2%

Most occurring characters

ValueCountFrequency (%)
0 7757521
35.7%
P 2719123
 
12.5%
H 2719123
 
12.5%
1 1260130
 
5.8%
3 1112357
 
5.1%
5 1011620
 
4.7%
2 991764
 
4.6%
9 943591
 
4.3%
8 894900
 
4.1%
6 879765
 
4.0%
Other values (4) 1463194
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16314816
75.0%
Uppercase Letter 5438272
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7757521
47.5%
1 1260130
 
7.7%
3 1112357
 
6.8%
5 1011620
 
6.2%
2 991764
 
6.1%
9 943591
 
5.8%
8 894900
 
5.5%
6 879765
 
5.4%
4 747476
 
4.6%
7 715692
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
P 2719123
50.0%
H 2719123
50.0%
M 13
 
< 0.1%
C 13
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 16314816
75.0%
Latin 5438272
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 7757521
47.5%
1 1260130
 
7.7%
3 1112357
 
6.8%
5 1011620
 
6.2%
2 991764
 
6.1%
9 943591
 
5.8%
8 894900
 
5.5%
6 879765
 
5.4%
4 747476
 
4.6%
7 715692
 
4.4%
Latin
ValueCountFrequency (%)
P 2719123
50.0%
H 2719123
50.0%
M 13
 
< 0.1%
C 13
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21753088
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7757521
35.7%
P 2719123
 
12.5%
H 2719123
 
12.5%
1 1260130
 
5.8%
3 1112357
 
5.1%
5 1011620
 
4.7%
2 991764
 
4.6%
9 943591
 
4.3%
8 894900
 
4.1%
6 879765
 
4.0%
Other values (4) 1463194
 
6.7%

IsFormulary
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.3 MiB

Value | Count | Frequency (%)
True | 2431846 | 89.4%
False | 287290 | 10.6%

Formulary
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 6
Distinct (%): < 0.1%
Missing: 332707
Missing (%): 12.2%
Memory size: 165.7 MiB

Value | Count
P1 | 1811418
P2 | 474186
Innovator | 72024
P3 | 16798
INNOVATOR | 11418

Length

Max length: 9
Median length: 2
Mean length: 2.2454919
Min length: 2

Characters and Unicode

Total characters: 5358707
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: P1
2nd row: P2
3rd row: P1
4th row: P1
5th row: P1

Common Values

Value | Count | Frequency (%)
P1 | 1811418 | 66.6%
P2 | 474186 | 17.4%
Innovator | 72024 | 2.6%
P3 | 16798 | 0.6%
INNOVATOR | 11418 | 0.4%
198.5 | 585 | < 0.1%
(Missing) | 332707 | 12.2%
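
The table lists "Innovator" and "INNOVATOR" as separate categories, which accounts for two of the six distinct values. If they are meant to be the same label (an assumption; the report does not say), a case-insensitive merge reduces the column to five values. The sketch below works on the counts from the table rather than on the underlying data, and `merge_case_variants` is an illustrative name. The value "198.5" is left untouched, since the report gives no indication of what it represents.

```python
from collections import Counter

# Counts copied from the Formulary "Common Values" table above (missing rows excluded).
raw_counts = {
    "P1": 1_811_418,
    "P2": 474_186,
    "Innovator": 72_024,
    "P3": 16_798,
    "INNOVATOR": 11_418,
    "198.5": 585,
}

def merge_case_variants(counts):
    """Sum the counts of labels that differ only by letter case."""
    merged = Counter()
    for label, count in counts.items():
        merged[label.casefold()] += count
    return merged

merged = merge_case_variants(raw_counts)
print(merged.most_common())
```

The merged total for "innovator" (72024 + 11418 = 83442) matches the lower-cased word count the report itself shows for this variable.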

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
p1 | 1811418 | 75.9%
p2 | 474186 | 19.9%
innovator | 83442 | 3.5%
p3 | 16798 | 0.7%
198.5 | 585 | < 0.1%
< 0.1%

Most occurring characters

ValueCountFrequency (%)
P 2302402
43.0%
1 1812003
33.8%
2 474186
 
8.8%
n 144048
 
2.7%
o 144048
 
2.7%
I 83442
 
1.6%
v 72024
 
1.3%
a 72024
 
1.3%
t 72024
 
1.3%
r 72024
 
1.3%
Other values (11) 110482
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2477188
46.2%
Decimal Number 2304742
43.0%
Lowercase Letter 576192
 
10.8%
Other Punctuation 585
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 2302402
92.9%
I 83442
 
3.4%
O 22836
 
0.9%
N 22836
 
0.9%
V 11418
 
0.5%
A 11418
 
0.5%
T 11418
 
0.5%
R 11418
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 1812003
78.6%
2 474186
 
20.6%
3 16798
 
0.7%
9 585
 
< 0.1%
8 585
 
< 0.1%
5 585
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
n 144048
25.0%
o 144048
25.0%
v 72024
12.5%
a 72024
12.5%
t 72024
12.5%
r 72024
12.5%
Other Punctuation
ValueCountFrequency (%)
. 585
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3053380
57.0%
Common 2305327
43.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 2302402
75.4%
n 144048
 
4.7%
o 144048
 
4.7%
I 83442
 
2.7%
v 72024
 
2.4%
a 72024
 
2.4%
t 72024
 
2.4%
r 72024
 
2.4%
O 22836
 
0.7%
N 22836
 
0.7%
Other values (4) 45672
 
1.5%
Common
ValueCountFrequency (%)
1 1812003
78.6%
2 474186
 
20.6%
3 16798
 
0.7%
9 585
 
< 0.1%
8 585
 
< 0.1%
. 585
 
< 0.1%
5 585
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5358707
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 2302402
43.0%
1 1812003
33.8%
2 474186
 
8.8%
n 144048
 
2.7%
o 144048
 
2.7%
I 83442
 
1.6%
v 72024
 
1.3%
a 72024
 
1.3%
t 72024
 
1.3%
r 72024
 
1.3%
Other values (11) 110482
 
2.1%

TQty
Real number (ℝ)

Distinct 271
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2.4365423
Minimum -450
Maximum 1250
Zeros 3
Zeros (%) < 0.1%
Negative 268905
Negative (%) 9.9%
Memory size 41.5 MiB

Quantile statistics

Minimum -450
5-th percentile -2
Q1 1
Median 1
Q3 2
95-th percentile 8
Maximum 1250
Range 1700
Interquartile range (IQR) 1

Descriptive statistics

Standard deviation 6.5954165
Coefficient of variation (CV) 2.7068754
Kurtosis 1164.0116
Mean 2.4365423
Median Absolute Deviation (MAD) 1
Skewness 16.816578
Sum 6625290
Variance 43.499519
Monotonicity Not monotonic
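
Several of the TQty figures above are simple functions of one another. The short sketch below recomputes the derived values from the reported inputs as a consistency check. The variable names are chosen for this example, and the relations used (mean = sum / count, CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1) are the standard definitions, assumed here to match what the report uses.

```python
# Reported inputs for TQty.
n_rows = 2719136          # number of observations (no missing values)
total = 6625290           # Sum
std = 6.5954165           # Standard deviation
minimum, maximum = -450, 1250
q1, q3 = 1, 2

# Derived quantities under the standard definitions.
mean = total / n_rows
cv = std / mean
variance = std ** 2
value_range = maximum - minimum
iqr = q3 - q1

print(f"mean={mean:.7f} cv={cv:.7f} variance={variance:.6f}")
print(f"range={value_range} iqr={iqr}")
```

The recomputed values agree with the reported Mean, CV, Variance, Range and IQR to within rounding of the printed inputs.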
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1266708
46.6%
2 524272
19.3%
3 220060
 
8.1%
4 143551
 
5.3%
-1 131909
 
4.9%
5 86702
 
3.2%
-2 70264
 
2.6%
6 56242
 
2.1%
10 43501
 
1.6%
-3 27657
 
1.0%
Other values (261) 148270
 
5.5%
ValueCountFrequency (%)
-450 1
 
< 0.1%
-378 1
 
< 0.1%
-360 1
 
< 0.1%
-300 2
 
< 0.1%
-182 1
 
< 0.1%
-180 4
< 0.1%
-170 1
 
< 0.1%
-164 1
 
< 0.1%
-160 1
 
< 0.1%
-120 8
< 0.1%
ValueCountFrequency (%)
1250 1
< 0.1%
1000 1
< 0.1%
750 1
< 0.1%
720 1
< 0.1%
600 1
< 0.1%
550 1
< 0.1%
540 1
< 0.1%
500 2
< 0.1%
450 1
< 0.1%
405 1
< 0.1%

UCPwithoutGST
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct 19059
Distinct (%) 0.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 267.00547
Minimum 0
Maximum 431504
Zeros 3355
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 0
5-th percentile 3.5
Q1 14.2
Median 47.46
Q3 129.2725
95-th percentile 879.29
Maximum 431504
Range 431504
Interquartile range (IQR) 115.0725

Descriptive statistics

Standard deviation 1473.7664
Coefficient of variation (CV) 5.5196115
Kurtosis 9644.2363
Mean 267.00547
Median Absolute Deviation (MAD) 38.46
Skewness 63.877524
Sum 7.2602418 × 10^8
Variance 2171987.5
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
9.75 77539
 
2.9%
111 50399
 
1.9%
49.65 38699
 
1.4%
1.78 36130
 
1.3%
39 32875
 
1.2%
11 26733
 
1.0%
13.04 23206
 
0.9%
50 22718
 
0.8%
4.4 20404
 
0.8%
29.8 18120
 
0.7%
Other values (19049) 2372313
87.2%
ValueCountFrequency (%)
0 3355
0.1%
0.01 160
 
< 0.1%
0.02 2
 
< 0.1%
0.04 5
 
< 0.1%
0.06 35
 
< 0.1%
0.07 4
 
< 0.1%
0.1 154
 
< 0.1%
0.11 7
 
< 0.1%
0.13 2
 
< 0.1%
0.14 2
 
< 0.1%
ValueCountFrequency (%)
431504 1
 
< 0.1%
296879 6
 
< 0.1%
230999.99 6
 
< 0.1%
214678.53 2
 
< 0.1%
164026 1
 
< 0.1%
157800 15
< 0.1%
138297.6 1
 
< 0.1%
135000 2
 
< 0.1%
128000 13
< 0.1%
104517 2
 
< 0.1%

PurGSTPer
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 22
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2.8100044
Minimum 0
Maximum 120
Zeros 2105336
Zeros (%) 77.4%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
Median 0
Q3 0
95-th percentile 12
Maximum 120
Range 120
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 5.6036392
Coefficient of variation (CV) 1.9941745
Kurtosis 5.0249423
Mean 2.8100044
Median Absolute Deviation (MAD) 0
Skewness 2.0252391
Sum 7640784
Variance 31.400772
Monotonicity Not monotonic
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%)
0 2105336
77.4%
12 480466
 
17.7%
5 59194
 
2.2%
24 33824
 
1.2%
18 31928
 
1.2%
10 4489
 
0.2%
36 3012
 
0.1%
48 512
 
< 0.1%
15 90
 
< 0.1%
20 72
 
< 0.1%
Other values (12) 213
 
< 0.1%
ValueCountFrequency (%)
0 2105336
77.4%
5 59194
 
2.2%
10 4489
 
0.2%
12 480466
 
17.7%
15 90
 
< 0.1%
17 2
 
< 0.1%
18 31928
 
1.2%
20 72
 
< 0.1%
24 33824
 
1.2%
25 12
 
< 0.1%
ValueCountFrequency (%)
120 2
 
< 0.1%
108 5
 
< 0.1%
96 3
 
< 0.1%
90 1
 
< 0.1%
84 3
 
< 0.1%
72 69
 
< 0.1%
60 31
 
< 0.1%
54 42
 
< 0.1%
48 512
 
< 0.1%
36 3012
0.1%

MRP
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct 12954
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 497.4035
Minimum 0.02
Maximum 555416
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 0.02
5-th percentile 5.34
Q1 29.06
Median 80
Q3 299
95-th percentile 2465.1
Maximum 555416
Range 555415.98
Interquartile range (IQR) 269.94

Descriptive statistics

Standard deviation 2126.8404
Coefficient of variation (CV) 4.2758855
Kurtosis 6947.3743
Mean 497.4035
Median Absolute Deviation (MAD) 66.88
Skewness 50.221613
Sum 1.3525078 × 10^9
Variance 4523450.2
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
17.65 35047
 
1.3%
17.33 28422
 
1.0%
37.92 26496
 
1.0%
42 24567
 
0.9%
602.55 21229
 
0.8%
284 16805
 
0.6%
2.56 16488
 
0.6%
259 16409
 
0.6%
17.74 16311
 
0.6%
42.01 16311
 
0.6%
Other values (12944) 2501051
92.0%
ValueCountFrequency (%)
0.02 2
 
< 0.1%
0.05 8
 
< 0.1%
0.08 35
< 0.1%
0.11 4
 
< 0.1%
0.16 7
 
< 0.1%
0.2 2
 
< 0.1%
0.21 2
 
< 0.1%
0.24 6
 
< 0.1%
0.34 4
 
< 0.1%
0.35 1
 
< 0.1%
ValueCountFrequency (%)
555416 1
 
< 0.1%
396725 6
< 0.1%
330000 6
< 0.1%
280000 2
 
< 0.1%
238999.99 2
 
< 0.1%
210400 6
< 0.1%
197250 9
< 0.1%
189000 2
 
< 0.1%
183000 1
 
< 0.1%
167370 1
 
< 0.1%

TotalCost
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct 42297
Distinct (%) 1.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 495.12128
Minimum 0
Maximum 483284.48
Zeros 3358
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 0
5-th percentile 6.85
Q1 27.24
Median 87.17
Q3 248.64
95-th percentile 1816.16
Maximum 483284.48
Range 483284.48
Interquartile range (IQR) 221.4

Descriptive statistics

Standard deviation 2512.7929
Coefficient of variation (CV) 5.0751059
Kurtosis 3562.6036
Mean 495.12128
Median Absolute Deviation (MAD) 69.93
Skewness 39.502489
Sum 1.3463021 × 10^9
Variance 6314128.4
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
43.68 30877
 
1.1%
21.84 27638
 
1.0%
10.92 25713
 
0.9%
12.32 19174
 
0.7%
55.61 18778
 
0.7%
248.64 16759
 
0.6%
32.76 16005
 
0.6%
124.32 15642
 
0.6%
111.22 14883
 
0.5%
24.64 14590
 
0.5%
Other values (42287) 2519077
92.6%
ValueCountFrequency (%)
0 3358
0.1%
0.01 12
 
< 0.1%
0.02 14
 
< 0.1%
0.03 22
 
< 0.1%
0.04 32
 
< 0.1%
0.06 10
 
< 0.1%
0.07 38
 
< 0.1%
0.08 45
 
< 0.1%
0.09 2
 
< 0.1%
0.1 1
 
< 0.1%
ValueCountFrequency (%)
483284.48 1
 
< 0.1%
401479.68 1
 
< 0.1%
376320 1
 
< 0.1%
363031.2 1
 
< 0.1%
353472 2
 
< 0.1%
332504.48 6
< 0.1%
301109.76 1
 
< 0.1%
292499.87 1
 
< 0.1%
263424 2
 
< 0.1%
258719.99 6
< 0.1%

TotalDiscount
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct 13071
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 8.5991873
Minimum -662.56
Maximum 79999.2
Zeros 2321505
Zeros (%) 85.4%
Negative 61
Negative (%) < 0.1%
Memory size 41.5 MiB

Quantile statistics

Minimum -662.56
5-th percentile 0
Q1 0
Median 0
Q3 0
95-th percentile 28.52
Maximum 79999.2
Range 80661.76
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 95.441009
Coefficient of variation (CV) 11.098841
Kurtosis 200970.97
Mean 8.5991873
Median Absolute Deviation (MAD) 0
Skewness 285.88113
Sum 23382360
Variance 9108.9862
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2321505
85.4%
13.8 2423
 
0.1%
10.5 2268
 
0.1%
142.5 2241
 
0.1%
14.9 1978
 
0.1%
3.09 1859
 
0.1%
23.9 1623
 
0.1%
19 1464
 
0.1%
24 1429
 
0.1%
11.95 1282
 
< 0.1%
Other values (13061) 381064
 
14.0%
ValueCountFrequency (%)
-662.56 1
 
< 0.1%
-322.82 1
 
< 0.1%
-316.05 1
 
< 0.1%
-237.48 1
 
< 0.1%
-161.4 3
< 0.1%
-147.78 1
 
< 0.1%
-129.26 1
 
< 0.1%
-121.06 1
 
< 0.1%
-110.85 1
 
< 0.1%
-106.52 1
 
< 0.1%
ValueCountFrequency (%)
79999.2 1
< 0.1%
39999.6 1
< 0.1%
34000 1
< 0.1%
18650.8 1
< 0.1%
14124 1
< 0.1%
13136.58 1
< 0.1%
12312 1
< 0.1%
11999.88 1
< 0.1%
11840.4 1
< 0.1%
11228.76 1
< 0.1%

NetSales
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct 42009
Distinct (%) 1.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 793.11882
Minimum -38.3
Maximum 555416
Zeros 269495
Zeros (%) 9.9%
Negative 3
Negative (%) < 0.1%
Memory size 41.5 MiB

Quantile statistics

Minimum -38.3
5-th percentile 0
Q1 31.5
Median 108
Q3 414
95-th percentile 3490
Maximum 555416
Range 555454.3
Interquartile range (IQR) 382.5

Descriptive statistics

Standard deviation 3422.807
Coefficient of variation (CV) 4.3156296
Kurtosis 2533.9107
Mean 793.11882
Median Absolute Deviation (MAD) 97.54
Skewness 31.18779
Sum 2.1565979 × 10^9
Variance 11715608
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 269495
 
9.9%
35.5 10251
 
0.4%
17.65 10221
 
0.4%
602.55 10086
 
0.4%
55.29 9769
 
0.4%
35.3 8094
 
0.3%
35 8045
 
0.3%
65 7795
 
0.3%
17.33 7310
 
0.3%
37.92 7228
 
0.3%
Other values (41999) 2370842
87.2%
ValueCountFrequency (%)
-38.3 1
 
< 0.1%
-11.83 1
 
< 0.1%
-3.86 1
 
< 0.1%
0 269495
9.9%
0.02 1
 
< 0.1%
0.04 1
 
< 0.1%
0.08 31
 
< 0.1%
0.1 1
 
< 0.1%
0.14 1
 
< 0.1%
0.15 1
 
< 0.1%
ValueCountFrequency (%)
555416 1
 
< 0.1%
524810 1
 
< 0.1%
508800 1
 
< 0.1%
456000 1
 
< 0.1%
430000 1
 
< 0.1%
420800 2
 
< 0.1%
418425 1
 
< 0.1%
396725 6
< 0.1%
381600 1
 
< 0.1%
380000 1
 
< 0.1%

ReturnMRP
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct 11784
Distinct (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 83.692677
Minimum 0
Maximum 238999.99
Zeros 2450236
Zeros (%) 90.1%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
Median 0
Q3 0
95-th percentile 122.81
Maximum 238999.99
Range 238999.99
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 935.54594
Coefficient of variation (CV) 11.178349
Kurtosis 4296.5919
Mean 83.692677
Median Absolute Deviation (MAD) 0
Skewness 42.286642
Sum 2.2757177 × 10^8
Variance 875246.2
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2450236
90.1%
37.92 1996
 
0.1%
17.65 1864
 
0.1%
284 1811
 
0.1%
42 1782
 
0.1%
602.55 1776
 
0.1%
17.33 1533
 
0.1%
518 1523
 
0.1%
259 1456
 
0.1%
35.3 1454
 
0.1%
Other values (11774) 253705
 
9.3%
ValueCountFrequency (%)
0 2450236
90.1%
0.9 2
 
< 0.1%
0.95 3
 
< 0.1%
0.96 11
 
< 0.1%
1 2
 
< 0.1%
1.06 10
 
< 0.1%
1.1 1
 
< 0.1%
1.17 3
 
< 0.1%
1.21 1
 
< 0.1%
1.27 4
 
< 0.1%
ValueCountFrequency (%)
238999.99 1
< 0.1%
160801.2 1
< 0.1%
148820 1
< 0.1%
143980 1
< 0.1%
136548 1
< 0.1%
123200 1
< 0.1%
112920 1
< 0.1%
107985 1
< 0.1%
105930 1
< 0.1%
101320 1
< 0.1%

GenericName
Categorical

Distinct 2962
Distinct (%) 0.1%
Missing 151
Missing (%) < 0.1%
Memory size 240.3 MiB

SODIUM CHLORIDE 0.9% 219097
PANTOPRAZOLE 40MG INJ 61812
MULTIPLE ELECTROLYTES 500ML IVF 60556
ONDANSETRON 2MG/ML 59346
PARACETAMOL 1GM IV INJ 55766
Other values (2957) 2262408

Length

Max length 287
Median length 224
Mean length 27.651424
Min length 4

Characters and Unicode

Total characters 75183806
Distinct characters 76
Distinct categories 11
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 138
Unique (%) < 0.1%

Sample

1st row ESOMEPRAZOLE 20MG
2nd row FENTANYL INJ 500MCG/10ML
3rd row CALCIUM GLUCONATE
4th row CHLORHEXIDINE GLUCONATE 0.2%W/V
5th row DEXTROSE 5% W/V IV FLUID

Common Values

ValueCountFrequency (%)
SODIUM CHLORIDE 0.9% 219097
 
8.1%
PANTOPRAZOLE 40MG INJ 61812
 
2.3%
MULTIPLE ELECTROLYTES 500ML IVF 60556
 
2.2%
ONDANSETRON 2MG/ML 59346
 
2.2%
PARACETAMOL 1GM IV INJ 55766
 
2.1%
WATER FOR INJECTION 10ML SOLUTION 54835
 
2.0%
LIGNOCAINE HYDROCHLORIDE 2% INJ 44678
 
1.6%
PIPERACILLIN 4GM+ TAZOBACTAM 500MG 44082
 
1.6%
SODIUM CHLORIDE IVF 100ML 37441
 
1.4%
ENOXAPARIN 40MG 34938
 
1.3%
Other values (2952) 2046434
75.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
inj 685650
 
6.4%
633136
 
5.9%
chloride 428719
 
4.0%
sodium 408464
 
3.8%
tab 321515
 
3.0%
0.9 234731
 
2.2%
40mg 204112
 
1.9%
500mg 175076
 
1.6%
1gm 146629
 
1.4%
ivf 141227
 
1.3%
Other values (2858) 7388415
68.6%

Most occurring characters

ValueCountFrequency (%)
8069352
 
10.7%
I 5714572
 
7.6%
M 5100626
 
6.8%
E 4762437
 
6.3%
O 4384886
 
5.8%
A 4321909
 
5.7%
L 4159913
 
5.5%
N 3795551
 
5.0%
0 3189844
 
4.2%
T 3031757
 
4.0%
Other values (66) 28652959
38.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 57031467
75.9%
Space Separator 8070225
 
10.7%
Decimal Number 7631744
 
10.2%
Other Punctuation 1567847
 
2.1%
Math Symbol 769466
 
1.0%
Dash Punctuation 47122
 
0.1%
Lowercase Letter 35931
 
< 0.1%
Close Punctuation 12655
 
< 0.1%
Open Punctuation 12506
 
< 0.1%
Control 4816
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 5714572
 
10.0%
M 5100626
 
8.9%
E 4762437
 
8.4%
O 4384886
 
7.7%
A 4321909
 
7.6%
L 4159913
 
7.3%
N 3795551
 
6.7%
T 3031757
 
5.3%
R 2868123
 
5.0%
C 2760430
 
4.8%
Other values (16) 16131263
28.3%
Lowercase Letter
ValueCountFrequency (%)
i 4558
12.7%
o 3238
 
9.0%
e 3151
 
8.8%
a 3065
 
8.5%
l 2496
 
6.9%
t 2089
 
5.8%
n 2014
 
5.6%
r 1935
 
5.4%
c 1890
 
5.3%
d 1821
 
5.1%
Other values (15) 9674
26.9%
Decimal Number
ValueCountFrequency (%)
0 3189844
41.8%
5 1261350
 
16.5%
1 1106530
 
14.5%
2 784342
 
10.3%
4 395444
 
5.2%
9 297769
 
3.9%
3 243273
 
3.2%
6 177597
 
2.3%
7 101919
 
1.3%
8 73676
 
1.0%
Other Punctuation
ValueCountFrequency (%)
. 617865
39.4%
% 509132
32.5%
/ 420516
26.8%
, 18768
 
1.2%
& 1566
 
0.1%
Space Separator
ValueCountFrequency (%)
8069352
> 99.9%
  873
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 47073
99.9%
– 49
 
0.1%
Control
ValueCountFrequency (%)
3612
75.0%
1204
 
25.0%
Math Symbol
ValueCountFrequency (%)
+ 769466
100.0%
Close Punctuation
ValueCountFrequency (%)
) 12655
100.0%
Open Punctuation
ValueCountFrequency (%)
( 12506
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 27
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 57067398
75.9%
Common 18116408
 
24.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 5714572
 
10.0%
M 5100626
 
8.9%
E 4762437
 
8.3%
O 4384886
 
7.7%
A 4321909
 
7.6%
L 4159913
 
7.3%
N 3795551
 
6.7%
T 3031757
 
5.3%
R 2868123
 
5.0%
C 2760430
 
4.8%
Other values (41) 16167194
28.3%
Common
ValueCountFrequency (%)
8069352
44.5%
0 3189844
 
17.6%
5 1261350
 
7.0%
1 1106530
 
6.1%
2 784342
 
4.3%
+ 769466
 
4.2%
. 617865
 
3.4%
% 509132
 
2.8%
/ 420516
 
2.3%
4 395444
 
2.2%
Other values (15) 992567
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 75182884
> 99.9%
None 873
 
< 0.1%
Punctuation 49
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8069352
 
10.7%
I 5714572
 
7.6%
M 5100626
 
6.8%
E 4762437
 
6.3%
O 4384886
 
5.8%
A 4321909
 
5.7%
L 4159913
 
5.5%
N 3795551
 
5.0%
0 3189844
 
4.2%
T 3031757
 
4.0%
Other values (64) 28652037
38.1%
None
ValueCountFrequency (%)
  873
100.0%
Punctuation
ValueCountFrequency (%)
– 49
100.0%

Category
Categorical

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 181.5 MiB

DRUGS 2719136

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 13595680
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row DRUGS
2nd row DRUGS
3rd row DRUGS
4th row DRUGS
5th row DRUGS

Common Values

ValueCountFrequency (%)
DRUGS 2719136
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
drugs 2719136
100.0%

Most occurring characters

ValueCountFrequency (%)
D 2719136
20.0%
R 2719136
20.0%
U 2719136
20.0%
G 2719136
20.0%
S 2719136
20.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 13595680
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 2719136
20.0%
R 2719136
20.0%
U 2719136
20.0%
G 2719136
20.0%
S 2719136
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13595680
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 2719136
20.0%
R 2719136
20.0%
U 2719136
20.0%
G 2719136
20.0%
S 2719136
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13595680
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
D 2719136
20.0%
R 2719136
20.0%
U 2719136
20.0%
G 2719136
20.0%
S 2719136
20.0%

SubCategory
Categorical

Distinct 33
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 210.3 MiB

INJECTIONS 1183247
TABLETS & CAPSULES 670016
IV FLUIDS, ELECTROLYTES, TPN 468778
SYRUP & SUSPENSION 89640
OINTMENTS, CREAMS & GELS 69245
Other values (28) 238210

Length

Max length 28
Median length 25
Mean length 16.103007
Min length 3

Characters and Unicode

Total characters 43786265
Distinct characters 27
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row TABLETS & CAPSULES
2nd row INJECTIONS
3rd row INJECTIONS
4th row LIQUIDS & SOLUTIONS
5th row IV FLUIDS, ELECTROLYTES, TPN

Common Values

ValueCountFrequency (%)
INJECTIONS 1183247
43.5%
TABLETS & CAPSULES 670016
24.6%
IV FLUIDS, ELECTROLYTES, TPN 468778
 
17.2%
SYRUP & SUSPENSION 89640
 
3.3%
OINTMENTS, CREAMS & GELS 69245
 
2.5%
INHALERS & RESPULES 64098
 
2.4%
LIQUIDS & SOLUTIONS 44421
 
1.6%
POWDER 40087
 
1.5%
NUTRITIONAL SUPPLEMENTS 29194
 
1.1%
VACCINE 18426
 
0.7%
Other values (23) 41984
 
1.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
injections 1183247
19.3%
945655
15.5%
capsules 670016
11.0%
tablets 670016
11.0%
iv 468778
 
7.7%
fluids 468778
 
7.7%
electrolytes 468778
 
7.7%
tpn 468778
 
7.7%
syrup 89642
 
1.5%
suspension 89640
 
1.5%
Other values (39) 592847
9.7%

Most occurring characters

ValueCountFrequency (%)
S 5162034
11.8%
E 4563368
10.4%
T 4216536
9.6%
I 3769067
 
8.6%
3397039
 
7.8%
N 3373737
 
7.7%
L 3097037
 
7.1%
C 2431443
 
5.6%
O 2017651
 
4.6%
U 1540972
 
3.5%
Other values (17) 10217381
23.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 38436769
87.8%
Space Separator 3397039
 
7.8%
Other Punctuation 1952457
 
4.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 5162034
13.4%
E 4563368
11.9%
T 4216536
11.0%
I 3769067
9.8%
N 3373737
8.8%
L 3097037
8.1%
C 2431443
 
6.3%
O 2017651
 
5.2%
U 1540972
 
4.0%
A 1537619
 
4.0%
Other values (13) 6727305
17.5%
Other Punctuation
ValueCountFrequency (%)
, 1006801
51.6%
& 945655
48.4%
. 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3397039
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 38436769
87.8%
Common 5349496
 
12.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 5162034
13.4%
E 4563368
11.9%
T 4216536
11.0%
I 3769067
9.8%
N 3373737
8.8%
L 3097037
8.1%
C 2431443
 
6.3%
O 2017651
 
5.2%
U 1540972
 
4.0%
A 1537619
 
4.0%
Other values (13) 6727305
17.5%
Common
ValueCountFrequency (%)
3397039
63.5%
, 1006801
 
18.8%
& 945655
 
17.7%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43786265
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 5162034
11.8%
E 4563368
10.4%
T 4216536
9.6%
I 3769067
 
8.6%
3397039
 
7.8%
N 3373737
 
7.7%
L 3097037
 
7.1%
C 2431443
 
5.6%
O 2017651
 
4.6%
U 1540972
 
3.5%
Other values (17) 10217381
23.3%

SubCategoryL3
Categorical

Distinct 36
Distinct (%) < 0.1%
Missing 9617
Missing (%) 0.4%
Memory size 235.5 MiB

INTRAVENOUS & OTHER STERILE SOLUTIONS 570608
ANTI-INFECTIVES 366863
GASTROINTESTINAL & HEPATOBILIARY SYSTEM 343922
CARDIOVASCULAR & HEMATOPOIETIC SYSTEM 313090
CENTRAL NERVOUS SYSTEM 278159
Other values (31) 836877

Length

Max length 66
Median length 38
Mean length 25.988316
Min length 8

Characters and Unicode

Total characters 70415837
Distinct characters 35
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row GASTROINTESTINAL & HEPATOBILIARY SYSTEM
2nd row CENTRAL NERVOUS SYSTEM
3rd row INTRAVENOUS & OTHER STERILE SOLUTIONS
4th row EAR & MOUTH/ THROAT
5th row INTRAVENOUS & OTHER STERILE SOLUTIONS

Common Values

ValueCountFrequency (%)
INTRAVENOUS & OTHER STERILE SOLUTIONS 570608
21.0%
ANTI-INFECTIVES 366863
13.5%
GASTROINTESTINAL & HEPATOBILIARY SYSTEM 343922
12.6%
CARDIOVASCULAR & HEMATOPOIETIC SYSTEM 313090
11.5%
CENTRAL NERVOUS SYSTEM 278159
10.2%
VITAMINS & MINERALS 142456
 
5.2%
RESPIRATORY SYSTEM 131322
 
4.8%
HORMONES 98809
 
3.6%
ANAESTHETICS 98699
 
3.6%
NUTRITION 87861
 
3.2%
Other values (26) 277730
10.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1430168
17.4%
system 1179223
14.4%
intravenous 570608
 
7.0%
other 570608
 
7.0%
sterile 570608
 
7.0%
solutions 570608
 
7.0%
anti-infectives 366863
 
4.5%
gastrointestinal 343931
 
4.2%
hepatobiliary 343922
 
4.2%
hematopoietic 313090
 
3.8%
Other values (55) 1940135
23.7%

Most occurring characters

ValueCountFrequency (%)
T 7346663
10.4%
S 7131971
10.1%
E 6973523
9.9%
I 6197439
 
8.8%
5490245
 
7.8%
O 5121720
 
7.3%
A 4680824
 
6.6%
N 4578334
 
6.5%
R 4456016
 
6.3%
L 2923223
 
4.2%
Other values (25) 15515879
22.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 63039964
89.5%
Space Separator 5490245
 
7.8%
Other Punctuation 1456921
 
2.1%
Dash Punctuation 428675
 
0.6%
Lowercase Letter 32
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 7346663
11.7%
S 7131971
11.3%
E 6973523
11.1%
I 6197439
9.8%
O 5121720
8.1%
A 4680824
7.4%
N 4578334
7.3%
R 4456016
 
7.1%
L 2923223
 
4.6%
M 2199189
 
3.5%
Other values (12) 11431062
18.1%
Lowercase Letter
ValueCountFrequency (%)
y 4
12.5%
u 4
12.5%
r 4
12.5%
v 4
12.5%
e 4
12.5%
d 4
12.5%
i 4
12.5%
c 4
12.5%
Other Punctuation
ValueCountFrequency (%)
& 1438965
98.8%
/ 15525
 
1.1%
, 2431
 
0.2%
Space Separator
ValueCountFrequency (%)
5490245
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 428675
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 63039996
89.5%
Common 7375841
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 7346663
11.7%
S 7131971
11.3%
E 6973523
11.1%
I 6197439
9.8%
O 5121720
8.1%
A 4680824
7.4%
N 4578334
7.3%
R 4456016
 
7.1%
L 2923223
 
4.6%
M 2199189
 
3.5%
Other values (20) 11431094
18.1%
Common
ValueCountFrequency (%)
5490245
74.4%
& 1438965
 
19.5%
- 428675
 
5.8%
/ 15525
 
0.2%
, 2431
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 70415837
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 7346663
10.4%
S 7131971
10.1%
E 6973523
9.9%
I 6197439
 
8.8%
5490245
 
7.8%
O 5121720
 
7.3%
A 4680824
 
6.6%
N 4578334
 
6.5%
R 4456016
 
6.3%
L 2923223
 
4.2%
Other values (25) 15515879
22.0%

Bill_Month
Real number (ℝ)

Distinct 12
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 6.4982656
Minimum 1
Maximum 12
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 3
Median 7
Q3 10
95-th percentile 12
Maximum 12
Range 11
Interquartile range (IQR) 7

Descriptive statistics

Standard deviation 3.585137
Coefficient of variation (CV) 0.55170675
Kurtosis -1.307667
Mean 6.4982656
Median Absolute Deviation (MAD) 3
Skewness -0.019958661
Sum 17669668
Variance 12.853207
Monotonicity Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
1 261265
9.6%
2 260863
9.6%
12 251629
9.3%
10 241769
8.9%
9 235343
8.7%
11 232622
8.6%
3 220717
8.1%
8 217232
8.0%
5 213263
7.8%
4 198405
7.3%
Other values (2) 386028
14.2%
ValueCountFrequency (%)
1 261265
9.6%
2 260863
9.6%
3 220717
8.1%
4 198405
7.3%
5 213263
7.8%
6 189628
7.0%
7 196400
7.2%
8 217232
8.0%
9 235343
8.7%
10 241769
8.9%
ValueCountFrequency (%)
12 251629
9.3%
11 232622
8.6%
10 241769
8.9%
9 235343
8.7%
8 217232
8.0%
7 196400
7.2%
6 189628
7.0%
5 213263
7.8%
4 198405
7.3%
3 220717
8.1%

Bill_Year
Categorical

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 178.9 MiB

2022 900714
2021 732418
2020 464638
2019 420421
2023 200945

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 10876544
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2019
2nd row 2019
3rd row 2019
4th row 2019
5th row 2019

Common Values

ValueCountFrequency (%)
2022 900714
33.1%
2021 732418
26.9%
2020 464638
17.1%
2019 420421
15.5%
2023 200945
 
7.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2022 900714
33.1%
2021 732418
26.9%
2020 464638
17.1%
2019 420421
15.5%
2023 200945
 
7.4%

Most occurring characters

ValueCountFrequency (%)
2 5918565
54.4%
0 3183774
29.3%
1 1152839
 
10.6%
9 420421
 
3.9%
3 200945
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10876544
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 5918565
54.4%
0 3183774
29.3%
1 1152839
 
10.6%
9 420421
 
3.9%
3 200945
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common 10876544
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 5918565
54.4%
0 3183774
29.3%
1 1152839
 
10.6%
9 420421
 
3.9%
3 200945
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10876544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 5918565
54.4%
0 3183774
29.3%
1 1152839
 
10.6%
9 420421
 
3.9%
3 200945
 
1.8%

Bill_Day
Real number (ℝ)

Distinct 31
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 15.67448
Minimum 1
Maximum 31
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 1
5-th percentile 2
Q1 8
Median 16
Q3 23
95-th percentile 29
Maximum 31
Range 30
Interquartile range (IQR) 15

Descriptive statistics

Standard deviation 8.7520599
Coefficient of variation (CV) 0.55836367
Kurtosis -1.189178
Mean 15.67448
Median Absolute Deviation (MAD) 8
Skewness 0.014145068
Sum 42621042
Variance 76.598553
Monotonicity Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
7 92584
 
3.4%
21 92572
 
3.4%
9 92106
 
3.4%
17 91899
 
3.4%
16 91764
 
3.4%
18 91495
 
3.4%
11 91103
 
3.4%
13 91024
 
3.3%
4 91004
 
3.3%
6 90854
 
3.3%
Other values (21) 1802731
66.3%
ValueCountFrequency (%)
1 84785
3.1%
2 88102
3.2%
3 90689
3.3%
4 91004
3.3%
5 89504
3.3%
6 90854
3.3%
7 92584
3.4%
8 88500
3.3%
9 92106
3.4%
10 90010
3.3%
ValueCountFrequency (%)
31 47915
1.8%
30 79981
2.9%
29 81163
3.0%
28 90501
3.3%
27 88561
3.3%
26 87346
3.2%
25 86727
3.2%
24 90253
3.3%
23 88994
3.3%
22 88566
3.3%

Bill_Week
Real number (ℝ)

Distinct 53
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 26.443931
Minimum 1
Maximum 53
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 41.5 MiB

Quantile statistics

Minimum 1
5-th percentile 3
Q1 12
Median 27
Q3 40
95-th percentile 50
Maximum 53
Range 52
Interquartile range (IQR) 28

Descriptive statistics

Standard deviation 15.656445
Coefficient of variation (CV) 0.59206191
Kurtosis -1.3079171
Mean 26.443931
Median Absolute Deviation (MAD) 14
Skewness -0.0099876765
Sum 71904646
Variance 245.12426
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
8 66354
 
2.4%
9 65741
 
2.4%
5 65566
 
2.4%
7 65204
 
2.4%
4 62772
 
2.3%
6 62467
 
2.3%
50 60057
 
2.2%
49 59083
 
2.2%
51 58320
 
2.1%
2 57737
 
2.1%
Other values (43) 2095835
77.1%
ValueCountFrequency (%)
1 57075
2.1%
2 57737
2.1%
3 57620
2.1%
4 62772
2.3%
5 65566
2.4%
6 62467
2.3%
7 65204
2.4%
8 66354
2.4%
9 65741
2.4%
10 53133
2.0%
Value | Count | Frequency (%)
53 | 7256 | 0.3%
52 | 50637 | 1.9%
51 | 58320 | 2.1%
50 | 60057 | 2.2%
49 | 59083 | 2.2%
48 | 55518 | 2.0%
47 | 57001 | 2.1%
46 | 53979 | 2.0%
45 | 54066 | 2.0%
44 | 50013 | 1.8%

Interactions

[Interaction plots: 144 plot panels, images not reproduced in this text version.]

Correlations

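Variable | UHID | TQty | UCPwithoutGST | PurGSTPer | MRP | TotalCost | TotalDiscount | NetSales | ReturnMRP | Bill_Month | Bill_Day | Bill_Week | SalesType | PharmacyType | SpecialisationName | Department | IsFormulary | Formulary | SubCategory | SubCategoryL3 | Bill_Year
UHID | 1.000 | 0.015 | -0.021 | 0.029 | -0.014 | -0.001 | 0.089 | 0.004 | -0.011 | 0.037 | 0.001 | 0.032 | 0.038 | 0.143 | 0.199 | 0.112 | 0.087 | 0.070 | 0.028 | 0.040 | 0.678
TQty | 0.015 | 1.000 | -0.126 | 0.109 | -0.113 | 0.188 | 0.113 | 0.450 | -0.547 | 0.015 | -0.005 | 0.014 | 0.029 | 0.033 | 0.008 | 0.041 | 0.000 | 0.004 | 0.038 | 0.025 | 0.015
UCPwithoutGST | -0.021 | -0.126 | 1.000 | 0.044 | 0.954 | 0.890 | 0.097 | 0.698 | -0.002 | 0.010 | 0.002 | 0.009 | 0.002 | 0.003 | 0.012 | 0.003 | 0.007 | 0.004 | 0.007 | 0.025 | 0.003
PurGSTPer | 0.029 | 0.109 | 0.044 | 1.000 | -0.021 | 0.083 | 0.758 | 0.080 | -0.156 | 0.002 | 0.001 | 0.000 | 0.136 | 0.213 | 0.054 | 0.109 | 0.008 | 0.054 | 0.100 | 0.077 | 0.051
MRP | -0.014 | -0.113 | 0.954 | -0.021 | 1.000 | 0.855 | 0.036 | 0.727 | 0.019 | 0.009 | 0.002 | 0.009 | 0.003 | 0.003 | 0.011 | 0.006 | 0.008 | 0.004 | 0.007 | 0.028 | 0.003
TotalCost | -0.001 | 0.188 | 0.890 | 0.083 | 0.855 | 1.000 | 0.146 | 0.790 | -0.005 | 0.012 | -0.000 | 0.011 | 0.002 | 0.003 | 0.011 | 0.000 | 0.008 | 0.004 | 0.006 | 0.027 | 0.004
TotalDiscount | 0.089 | 0.113 | 0.097 | 0.758 | 0.036 | 0.146 | 1.000 | 0.129 | -0.136 | 0.002 | 0.001 | -0.001 | 0.002 | 0.004 | 0.000 | 0.006 | 0.002 | 0.007 | 0.000 | 0.004 | 0.002
NetSales | 0.004 | 0.450 | 0.698 | 0.080 | 0.727 | 0.790 | 0.129 | 1.000 | -0.516 | 0.006 | -0.000 | 0.006 | 0.004 | 0.004 | 0.011 | 0.003 | 0.011 | 0.004 | 0.017 | 0.028 | 0.004
ReturnMRP | -0.011 | -0.547 | -0.002 | -0.156 | 0.019 | -0.005 | -0.136 | -0.516 | 1.000 | 0.009 | -0.001 | 0.007 | 0.019 | 0.004 | 0.000 | 0.003 | 0.004 | 0.008 | 0.003 | 0.008 | 0.003
Bill_Month | 0.037 | 0.015 | 0.010 | 0.002 | 0.009 | 0.012 | 0.002 | 0.006 | 0.009 | 1.000 | 0.016 | 0.976 | 0.017 | 0.049 | 0.097 | 0.078 | 0.048 | 0.017 | 0.013 | 0.017 | 0.264
Bill_Day | 0.001 | -0.005 | 0.002 | 0.001 | 0.002 | -0.000 | 0.001 | -0.000 | -0.001 | 0.016 | 1.000 | 0.078 | 0.006 | 0.009 | 0.023 | 0.010 | 0.008 | 0.003 | 0.004 | 0.005 | 0.036
Bill_Week | 0.032 | 0.014 | 0.009 | 0.000 | 0.009 | 0.011 | -0.001 | 0.006 | 0.007 | 0.976 | 0.078 | 1.000 | 0.016 | 0.047 | 0.096 | 0.074 | 0.048 | 0.018 | 0.014 | 0.017 | 0.260
SalesType | 0.038 | 0.029 | 0.002 | 0.136 | 0.003 | 0.002 | 0.002 | 0.004 | 0.019 | 0.017 | 0.006 | 0.016 | 1.000 | 0.715 | 0.317 | 0.458 | 0.013 | 0.059 | 0.283 | 0.171 | 0.054
PharmacyType | 0.143 | 0.033 | 0.003 | 0.213 | 0.003 | 0.003 | 0.004 | 0.004 | 0.004 | 0.049 | 0.009 | 0.047 | 0.715 | 1.000 | 0.366 | 0.661 | 0.000 | 0.088 | 0.428 | 0.254 | 0.094
SpecialisationName | 0.199 | 0.008 | 0.012 | 0.054 | 0.011 | 0.011 | 0.000 | 0.011 | 0.000 | 0.097 | 0.023 | 0.096 | 0.317 | 0.366 | 1.000 | 0.285 | 0.121 | 0.088 | 0.138 | 0.162 | 0.265
Department | 0.112 | 0.041 | 0.003 | 0.109 | 0.006 | 0.000 | 0.006 | 0.003 | 0.003 | 0.078 | 0.010 | 0.074 | 0.458 | 0.661 | 0.285 | 1.000 | 0.061 | 0.048 | 0.172 | 0.140 | 0.198
IsFormulary | 0.087 | 0.000 | 0.007 | 0.008 | 0.008 | 0.008 | 0.002 | 0.011 | 0.004 | 0.048 | 0.008 | 0.048 | 0.013 | 0.000 | 0.121 | 0.061 | 1.000 | 1.000 | 0.146 | 0.191 | 0.058
Formulary | 0.070 | 0.004 | 0.004 | 0.054 | 0.004 | 0.004 | 0.007 | 0.004 | 0.008 | 0.017 | 0.003 | 0.018 | 0.059 | 0.088 | 0.088 | 0.048 | 1.000 | 1.000 | 0.150 | 0.243 | 0.088
SubCategory | 0.028 | 0.038 | 0.007 | 0.100 | 0.007 | 0.006 | 0.000 | 0.017 | 0.003 | 0.013 | 0.004 | 0.014 | 0.283 | 0.428 | 0.138 | 0.172 | 0.146 | 0.150 | 1.000 | 0.401 | 0.049
SubCategoryL3 | 0.040 | 0.025 | 0.025 | 0.077 | 0.028 | 0.027 | 0.004 | 0.028 | 0.008 | 0.017 | 0.005 | 0.017 | 0.171 | 0.254 | 0.162 | 0.140 | 0.191 | 0.243 | 0.401 | 1.000 | 0.057
Bill_Year | 0.678 | 0.015 | 0.003 | 0.051 | 0.003 | 0.004 | 0.002 | 0.004 | 0.003 | 0.264 | 0.036 | 0.260 | 0.054 | 0.094 | 0.265 | 0.198 | 0.058 | 0.088 | 0.049 | 0.057 | 1.000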
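
To read the matrix above, it can help to list only the pairs whose absolute correlation exceeds a cut-off. The sketch below does this for a handful of entries copied from the matrix. The 0.5 threshold is an assumption chosen because it is consistent with the "High correlation" alerts at the top of this report; the report does not state the threshold itself, and the function name `high_correlation_pairs` is hypothetical.

```python
# A few entries copied from the correlation matrix, keyed by variable pair.
correlations = {
    ("UCPwithoutGST", "MRP"): 0.954,
    ("Bill_Month", "Bill_Week"): 0.976,
    ("TQty", "ReturnMRP"): -0.547,
    ("TQty", "NetSales"): 0.450,
    ("UHID", "Bill_Year"): 0.678,
    ("Bill_Day", "Bill_Week"): 0.078,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_correlation_pairs(corr, threshold=THRESHOLD):
    """Return (pair, value) tuples with |value| >= threshold, strongest first."""
    flagged = [(pair, value) for pair, value in corr.items() if abs(value) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


flagged_pairs = high_correlation_pairs(correlations)
for (left, right), value in flagged_pairs:
    print(f"{left} ~ {right}: {value:+.3f}")
```

Using the absolute value matters here: TQty and ReturnMRP are negatively correlated (-0.547) and would be missed by a check on the signed value alone.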

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
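
Nullity correlation, as described above, can be illustrated by replacing each column with a presence indicator (1 if a value is present, 0 if it is missing) and correlating those indicators. The sketch below uses plain Python and a Pearson correlation. The four rows are made-up toy data, not values from this dataset, and the helper names are hypothetical; with pandas, `df.isnull().corr()` computes the same kind of matrix.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / sqrt(var_x * var_y)


def nullity_correlation(rows, col_a, col_b):
    """Correlate the presence indicators (1 = present, 0 = missing) of two columns."""
    present_a = [0 if row[col_a] is None else 1 for row in rows]
    present_b = [0 if row[col_b] is None else 1 for row in rows]
    return pearson(present_a, present_b)


# Toy rows (illustrative only): UHID and Department are missing together,
# while Formulary is missing independently of them.
toy_rows = [
    {"UHID": 1, "Department": "A", "Formulary": "P1"},
    {"UHID": 2, "Department": "B", "Formulary": None},
    {"UHID": None, "Department": None, "Formulary": "P1"},
    {"UHID": None, "Department": None, "Formulary": None},
]

uhid_department = nullity_correlation(toy_rows, "UHID", "Department")
uhid_formulary = nullity_correlation(toy_rows, "UHID", "Formulary")
print(uhid_department, uhid_formulary)
```

A value near 1 means the two columns tend to be missing in the same rows; a value near 0 means the missingness of one says little about the other.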

Sample

(index) | SalesType | UHID | PharmacyType | SpecialisationName | Department | BillNo | BillDate | ItemName | Item_Code | IsFormulary | Formulary | TQty | UCPwithoutGST | PurGSTPer | MRP | TotalCost | TotalDiscount | NetSales | ReturnMRP | GenericName | Category | SubCategory | SubCategoryL3 | Bill_Month | Bill_Year | Bill_Day | Bill_Week
2 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08659 | 2019-01-30 | NEKSIUM (ESOMEPRAZOLE) 20MG TAB 1x10 PFIZER | PH001036 | Yes | P1 | 1 | 48.00 | 0 | 63.00 | 53.76 | 0.0 | 63.00 | 0.0 | ESOMEPRAZOLE 20MG | DRUGS | TABLETS & CAPSULES | GASTROINTESTINAL & HEPATOBILIARY SYSTEM | 1 | 2019 | 30 | 5
7 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08923 | 2019-01-31 | FENSTUD (FENTANYL) 50MCG/10ML INJ RUSAN | PH001660 | Yes | P2 | 3 | 69.00 | 0 | 230.70 | 231.84 | 0.0 | 692.10 | 0.0 | FENTANYL INJ 500MCG/10ML | DRUGS | INJECTIONS | CENTRAL NERVOUS SYSTEM | 1 | 2019 | 31 | 5
12 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | CALCIUM GLUCONATE 10ML INJ 1X1 HINDU | PH000543 | Yes | P1 | 5 | 4.00 | 0 | 5.82 | 22.40 | 0.0 | 29.10 | 0.0 | CALCIUM GLUCONATE | DRUGS | INJECTIONS | INTRAVENOUS & OTHER STERILE SOLUTIONS | 1 | 2019 | 31 | 5
13 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | CLOHEX MOUTH WASH 150ML DR REDDYS | PH000604 | Yes | P1 | 1 | 69.95 | 0 | 102.00 | 78.34 | 0.0 | 102.00 | 0.0 | CHLORHEXIDINE GLUCONATE 0.2%W/V | DRUGS | LIQUIDS & SOLUTIONS | EAR & MOUTH/ THROAT | 1 | 2019 | 31 | 5
14 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | DEXTROSE 5% 500ML CLARIS | PH000564 | Yes | P1 | 2 | 16.50 | 0 | 31.02 | 36.96 | 0.0 | 62.04 | 0.0 | DEXTROSE 5% W/V IV FLUID | DRUGS | IV FLUIDS, ELECTROLYTES, TPN | INTRAVENOUS & OTHER STERILE SOLUTIONS | 1 | 2019 | 31 | 5
16 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | EFIPRES (EPHEDRINE) 1ML INJ NEON | PH000354 | Yes | P1 | 1 | 21.95 | 0 | 28.00 | 24.58 | 0.0 | 28.00 | 0.0 | EPHEDRINE 30MG | DRUGS | INJECTIONS | CARDIOVASCULAR & HEMATOPOIETIC SYSTEM | 1 | 2019 | 31 | 5
17 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | FLEXBUMIN (HUMAN ALBUMIN) 20% 100ML INJ BAXTER | PH000401 | No | NaN | 2 | 3804.00 | 0 | 4765.00 | 7988.40 | 0.0 | 9530.00 | 0.0 | HUMAN ALBUMIN 20% INJ | DRUGS | INJECTIONS | INTRAVENOUS & OTHER STERILE SOLUTIONS | 1 | 2019 | 31 | 5
21 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | HAEMACCEL 500ML ABBOTT | PH000560 | Yes | P1 | 4 | 273.57 | 0 | 383.00 | 1225.59 | 0.0 | 1532.00 | 0.0 | SODIUM CHLORIDE 0.85G + POTASSIUM CHLORIDE 0.038G + CALCIUM CHLORIDE 0.070G | DRUGS | IV FLUIDS, ELECTROLYTES, TPN | INTRAVENOUS & OTHER STERILE SOLUTIONS | 1 | 2019 | 31 | 5
23 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | HUMAN ALBUMIN 5% 250ML INJ BAXTER | PH000402 | Yes | P1 | 2 | 2700.00 | 0 | 3054.00 | 5670.00 | 0.0 | 6108.00 | 0.0 | HUMAN ALBUMIN 5% 250ML INJ | DRUGS | INJECTIONS | INTRAVENOUS & OTHER STERILE SOLUTIONS | 1 | 2019 | 31 | 5
24 | IP Dispense | 1.201800e+10 | IP | Liver Disease and Transplantation | MS IP Pharmacy | PDS19-08981 | 2019-01-31 | KABILYTE (MULTIPLE ELECTROLYTES) 500ML FLEXBAG FRESENIUS | PH000561 | Yes | P1 | 4 | 124.32 | 0 | 196.90 | 556.95 | 0.0 | 787.60 | 0.0 | MULTIPLE ELECTROLYTES 500ML IVF | DRUGS | IV FLUIDS, ELECTROLYTES, TPN | INTRAVENOUS & OTHER STERILE SOLUTIONS | 1 | 2019 | 31 | 5

(index) | SalesType | UHID | PharmacyType | SpecialisationName | Department | BillNo | BillDate | ItemName | Item_Code | IsFormulary | Formulary | TQty | UCPwithoutGST | PurGSTPer | MRP | TotalCost | TotalDiscount | NetSales | ReturnMRP | GenericName | Category | SubCategory | SubCategoryL3 | Bill_Month | Bill_Year | Bill_Day | Bill_Week
5643249 | OTC Return | NaN | OTC | Urology | NaN | SLR22-144571 | 2022-09-14 | VSL 3 (LACTIC ACID BACTERIA+BIFIDOBACTERIA) CAP 1x10 SUN PHARMA | PH001697 | Yes | P1 | -2 | 273.47 | 12 | 403.00 | 612.57 | 0.0 | 0.0 | 725.40 | LACTIC ACID BACTERIA + BIFIDOBACTERIA CAP | DRUGS | TABLETS & CAPSULES | GASTROINTESTINAL & HEPATOBILIARY SYSTEM | 9 | 2022 | 14 | 37
5643250 | OTC Return | NaN | OTC | Urology | NaN | SLR22-146778 | 2022-09-17 | CRANPAC-D (CRANBERRY AND D-MANNOSE) TAB 1X10S IPCA | PH003483 | Yes | P1 | -3 | 235.09 | 18 | 365.00 | 832.22 | 0.0 | 0.0 | 985.50 | CRANBERRY AND D-MANNOSE | DRUGS | TABLETS & CAPSULES | NUTRITION | 9 | 2022 | 17 | 37
5643251 | OTC Return | NaN | OTC | Urology | NaN | SLR22-146778 | 2022-09-17 | EVALON (ESTRIOL) 15GM CREAM MSD | PH002315 | Yes | P1 | -1 | 314.11 | 12 | 462.90 | 351.80 | 0.0 | 0.0 | 416.61 | ESTRIOL 1MG | DRUGS | OINTMENTS, CREAMS & GELS | HORMONES | 9 | 2022 | 17 | 37
5643252 | OTC Return | NaN | OTC | Urology | NaN | SLR22-146778 | 2022-09-17 | MARTIFUR (NITROFURANTOIN) 100MG TAB 1x14 WALTER | PH001174 | No | NaN | -2 | 56.70 | 12 | 79.57 | 127.01 | 0.0 | 0.0 | 143.22 | NITROFURANTOIN 100MG TAB | DRUGS | TABLETS & CAPSULES | ANTI-INFECTIVES | 9 | 2022 | 17 | 37
5643253 | OTC Return | NaN | OTC | Urology | NaN | SLR22-58568 | 2022-04-26 | PEGMOVE (POLYETHYLENE GLYCOL) 119GM POW SUN PHARMA | PH002730 | Yes | P1 | -2 | 153.60 | 12 | 224.00 | 344.06 | 0.0 | 0.0 | 403.20 | POLYETHYLENE GLYCOL POWDER | DRUGS | POWDER | GASTROINTESTINAL & HEPATOBILIARY SYSTEM | 4 | 2022 | 26 | 17
5643254 | OTC Return | NaN | OTC | Urology | NaN | SLR22-71319 | 2022-05-18 | CRESTOR (ROSUVASTATIN) 10 MG TAB 1x30 ASTRAZENECA | PH002571 | Yes | P1 | -1 | 445.71 | 12 | 624.00 | 499.20 | 0.0 | 0.0 | 561.60 | ROSUVASTATIN 10MG TAB | DRUGS | TABLETS & CAPSULES | CARDIOVASCULAR & HEMATOPOIETIC SYSTEM | 5 | 2022 | 18 | 20
5643255 | OTC Return | NaN | OTC | Urology | NaN | SLR22-71319 | 2022-05-18 | FLAVEDON MR (TRIMETAZIDINE) TAB 1x10 SERDIA | PH001302 | Yes | P1 | -1 | 89.76 | 12 | 130.90 | 100.53 | 0.0 | 0.0 | 117.81 | TRIMETAZIDINE 35MG | DRUGS | TABLETS & CAPSULES | CARDIOVASCULAR & HEMATOPOIETIC SYSTEM | 5 | 2022 | 18 | 20
5643256 | OTC Return | NaN | OTC | Urology | NaN | SLR22-84863 | 2022-06-10 | ARG9 SACHET 5GM NOUVEAU MEDICAMENTS | PH002388 | Yes | P1 | -33 | 30.28 | 12 | 42.40 | 1119.15 | 0.0 | 0.0 | 1399.20 | L-ARGININE 3GM + PROANTHOCYANIDINS 75MG/GM POWDER | DRUGS | POWDER | VITAMINS & MINERALS | 6 | 2022 | 10 | 23
5643257 | OTC Return | NaN | OTC | Urology | NaN | SLR22-94095 | 2022-06-26 | UDILIV (URSODEOXYCHOLIC ACID) 300MG TAB 1x15 ABBOTT | PH000873 | Yes | P1 | -3 | 397.21 | 5 | 694.19 | 1251.21 | 0.0 | 0.0 | 1874.31 | URSODEOXYCHOLIC ACID 300MG TAB | DRUGS | TABLETS & CAPSULES | GASTROINTESTINAL & HEPATOBILIARY SYSTEM | 6 | 2022 | 26 | 25
5643258 | OTC Return | NaN | OTC | Urology | NaN | SLR22-94095 | 2022-06-26 | URIMAX D (TAMSULOZIN 0.4MG + DUTASTERIDE 0.5MG) TAB 1x10 CIPLA | PH000119 | Yes | P1 | -4 | 375.54 | 12 | 525.74 | 1682.42 | 0.0 | 0.0 | 1892.68 | TAMSULOSIN 0.4MG + DUTASTERIDE 0.5MG TAB | DRUGS | TABLETS & CAPSULES | GENITO-URINARY SYSTEM | 6 | 2022 | 26 | 25
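
The Bill_Month, Bill_Year, Bill_Day and Bill_Week columns in the sample rows above look like calendar parts derived from BillDate. The sketch below shows one way such columns could be derived, assuming Bill_Week is the ISO 8601 week number. That assumption is not stated in the report, but it matches the sample rows (for example, 2019-01-30 falls in week 5 and 2022-09-14 in week 37) and the reported maximum of 53. The function name `bill_date_parts` is hypothetical.

```python
from datetime import date


def bill_date_parts(bill_date):
    """Split an ISO-formatted date string into month, year, day and ISO week."""
    parsed = date.fromisoformat(bill_date)
    return {
        "Bill_Month": parsed.month,
        "Bill_Year": parsed.year,
        "Bill_Day": parsed.day,
        "Bill_Week": parsed.isocalendar()[1],  # ISO 8601 week number (assumed)
    }


# BillDate values taken from the sample rows above.
for bill_date in ("2019-01-30", "2022-09-14", "2022-06-26"):
    print(bill_date, bill_date_parts(bill_date))
```

Because the month and the week are both computed from the same date, a strong correlation between Bill_Month and Bill_Week is expected rather than informative.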